
# Vanilla Transformer in PyTorch

## Overview

A from-scratch implementation of an encoder-decoder Transformer in PyTorch, including Multi-Head Attention, inspired by "Attention Is All You Need". It is designed primarily for neural machine translation (NMT), specifically Chinese-to-Thai translation.
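The core of Multi-Head Attention is scaled dot-product attention, `softmax(QK^T / sqrt(d_k)) V`, as defined in the paper. The following is a minimal PyTorch sketch of that operation for orientation, not the exact code from `models/layers/`:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k); mask is 0 where attention is disallowed
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # (batch, heads, q_len, k_len)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    weights = F.softmax(scores, dim=-1)             # attention distribution over keys
    return weights @ v                              # weighted sum of values
```

Multi-Head Attention runs several of these in parallel on learned linear projections of the inputs and concatenates the results.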

## Table of Contents

- [Installation](#installation)
- [Usage](#usage)
- [File Structure](#file-structure)
- [Training](#training)
- [Testing](#testing)
- [References](#references)

## Installation

To install the required dependencies, run the following command:

```
pip install -r requirements.txt
```

## Usage

Before using the model, download the necessary datasets and preprocess them accordingly. An example of how to structure your data is in `data.py`; a sketch of one common layout follows.
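The exact format expected by `data.py` is not reproduced here. Assuming the usual NMT setup of line-aligned source/target files and pre-fitted tokenizers (all names below are hypothetical), a parallel-corpus dataset might look like:

```python
import torch
from torch.utils.data import Dataset

class ParallelCorpus(Dataset):
    """Hypothetical parallel corpus: one aligned sentence pair per line."""

    def __init__(self, src_path, tgt_path, src_tokenize, tgt_tokenize):
        # src_tokenize / tgt_tokenize map a sentence string to a list of token ids
        with open(src_path, encoding="utf-8") as f:
            self.src = [src_tokenize(line.strip()) for line in f]
        with open(tgt_path, encoding="utf-8") as f:
            self.tgt = [tgt_tokenize(line.strip()) for line in f]
        assert len(self.src) == len(self.tgt), "source and target must be line-aligned"

    def __len__(self):
        return len(self.src)

    def __getitem__(self, idx):
        return torch.tensor(self.src[idx]), torch.tensor(self.tgt[idx])
```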

To train the model, execute:

```
python train.py
```

For testing, you can use the example script:

```
python test_example.py
```

Make sure to customize the paths and configurations in `param.py` and `data.py` to match your setup, as this code was originally used to train a Chinese-to-Thai translation model.

## File Structure

The project structure is organized as follows:

- `models/`: Contains the core components of the Transformer.
  - `layers/`: Various layers used in the model.
  - `model/`: Implementation of the Encoder, Decoder, and the complete Transformer architecture.
- `train.py`: Script for training the model.
- `test_example.py`: Script for testing the trained model on an example.

## Training

Adjust the hyperparameters and configurations in `param.py` before starting the training process; a sketch of typical values is shown below. The trained model will be saved in the `models/` directory.
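The names and defaults in the actual `param.py` may differ; the values below are an illustrative configuration mirroring the base model from "Attention Is All You Need", not the repository's real settings:

```python
# Hypothetical param.py contents -- names and values are illustrative only.
d_model = 512        # embedding / hidden size
n_layers = 6         # number of encoder and decoder layers
n_heads = 8          # attention heads per layer
ffn_hidden = 2048    # width of the position-wise feed-forward sublayer
dropout = 0.1
max_len = 256        # maximum sequence length
batch_size = 128
epochs = 100
init_lr = 1e-4
```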

## Testing

Test the model using the testing script `test_example.py`. Modify the input examples in that script to observe the model's outputs; a decoding sketch follows.
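For orientation, autoregressive inference with an encoder-decoder Transformer is usually a greedy loop like the one below. It assumes the model takes `(src, tgt)` and returns per-position vocabulary logits; `test_example.py` may decode differently.

```python
import torch

def greedy_decode(model, src, max_len, sos_idx, eos_idx):
    # src: (1, src_len) tensor of source token ids.
    # Assumes model(src, tgt) -> logits of shape (1, tgt_len, vocab_size).
    model.eval()
    tgt = torch.tensor([[sos_idx]], dtype=torch.long)
    with torch.no_grad():
        for _ in range(max_len - 1):
            logits = model(src, tgt)
            next_tok = logits[:, -1].argmax(dim=-1, keepdim=True)  # most likely next token
            tgt = torch.cat([tgt, next_tok], dim=1)
            if next_tok.item() == eos_idx:
                break
    return tgt  # includes the <sos> token (and <eos>, if produced)
```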

## References

1. "Attention Is All You Need" (Vaswani et al., 2017)
2. Transformer implementation by Kevin Ko
3. "The Illustrated Transformer" by Jay Alammar
4. TensorFlow Transformer tutorial