Vanilla Transformer in PyTorch

Overview

A self-implemented Encoder-Decoder Transformer in PyTorch, including a from-scratch Multi-Head Attention module, inspired by "Attention Is All You Need." The model is primarily designed for Neural Machine Translation (NMT), specifically Chinese-to-Thai translation.
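As a rough illustration of the self-implemented Multi-Head Attention, the sketch below shows a standard scaled dot-product multi-head attention module in PyTorch. The class and variable names are illustrative and may differ from the ones used in models/layers/.

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Scaled dot-product multi-head attention (illustrative sketch)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, q, k, v, mask=None):
        batch, q_len, _ = q.shape
        k_len = k.size(1)

        def split(x, seq_len):
            # (batch, seq_len, d_model) -> (batch, n_heads, seq_len, d_head)
            return x.view(batch, seq_len, self.n_heads, self.d_head).transpose(1, 2)

        q = split(self.w_q(q), q_len)
        k = split(self.w_k(k), k_len)
        v = split(self.w_v(v), k_len)

        # Scaled dot-product attention scores: (batch, n_heads, q_len, k_len)
        scores = q @ k.transpose(-2, -1) / (self.d_head ** 0.5)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)

        # Concatenate the heads and apply the output projection
        out = (attn @ v).transpose(1, 2).contiguous().view(batch, q_len, -1)
        return self.w_o(out)
```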

Table of Contents

  • Installation
  • Usage
  • File Structure
  • Training
  • Testing
  • References

Installation

To install the required dependencies, run the following command:

pip install -r requirements.txt

Usage

Before using the model, make sure to download the necessary datasets and preprocess them accordingly. You can find an example of how to structure your data in the data.py file.

To train the model, execute:

python train.py

For testing, you can use the example script:

python test_example.py

Make sure to customize the paths and configurations in the param.py and data.py files to match your setup, as this code was originally used to train a Chinese-to-Thai translation model.
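For orientation, the snippet below sketches the kind of settings such a configuration file typically holds. The variable names and values here are assumptions, not the actual contents of param.py.

```python
import torch

# Illustrative configuration values; the real names in param.py may differ.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Model dimensions
d_model = 512        # embedding / hidden size
n_heads = 8          # number of attention heads
n_layers = 6         # encoder and decoder layers
ffn_hidden = 2048    # feed-forward inner dimension
max_len = 256        # maximum sequence length
drop_prob = 0.1      # dropout probability

# Training settings
batch_size = 64
init_lr = 1e-4
epochs = 100

# Data paths (replace with your own Chinese/Thai corpus)
src_train_path = "data/train.zh"
tgt_train_path = "data/train.th"
```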

File Structure

The project structure is organized as follows:

  • models/: Contains the core components of the Transformer.
    • layers/: Various layers used in the model.
    • model/: Implementation of the Encoder, Decoder, and the complete Transformer architecture (see the sketch after this list).
  • train.py: Script for training the model.
  • test_example.py: Script for testing the trained model on an example.
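To show how those pieces typically fit together, here is a minimal sketch of an encoder-decoder wrapper with padding and causal masks. The constructor arguments and mask conventions are assumptions, not necessarily the repository's exact API.

```python
import torch
import torch.nn as nn

class Seq2SeqTransformer(nn.Module):
    """Minimal wrapper around an encoder and a decoder module (illustrative only)."""

    def __init__(self, encoder: nn.Module, decoder: nn.Module,
                 src_pad_idx: int, tgt_pad_idx: int):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.src_pad_idx = src_pad_idx
        self.tgt_pad_idx = tgt_pad_idx

    def forward(self, src, tgt):
        # Padding mask for the source: (batch, 1, 1, src_len)
        src_mask = (src != self.src_pad_idx).unsqueeze(1).unsqueeze(2)
        # Padding mask combined with a causal mask for the target
        tgt_len = tgt.size(1)
        causal = torch.tril(torch.ones(tgt_len, tgt_len, dtype=torch.bool, device=tgt.device))
        tgt_mask = (tgt != self.tgt_pad_idx).unsqueeze(1).unsqueeze(2) & causal
        memory = self.encoder(src, src_mask)                  # encode the source sentence
        return self.decoder(tgt, memory, tgt_mask, src_mask)  # logits over the target vocabulary
```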

Training

Adjust the hyperparameters and configurations in the param.py file before starting the training process. The trained model will be saved in the models/ directory.
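As a rough sketch of what a training step may look like under these settings, the function below runs one epoch of teacher-forced cross-entropy training. The model, dataloader, and padding index names are assumptions standing in for whatever train.py actually builds from param.py and data.py.

```python
import torch
import torch.nn as nn

def train_one_epoch(model, dataloader, optimizer, pad_idx, device="cpu"):
    criterion = nn.CrossEntropyLoss(ignore_index=pad_idx)  # ignore padding tokens
    model.train()
    total_loss = 0.0
    for src, tgt in dataloader:
        src, tgt = src.to(device), tgt.to(device)
        # Teacher forcing: feed the target shifted right, predict the next token
        logits = model(src, tgt[:, :-1])
        loss = criterion(logits.reshape(-1, logits.size(-1)), tgt[:, 1:].reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # stabilize training
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(dataloader)
```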

Testing

Test the model using the testing script test_example.py. Modify the input examples to observe the model's outputs.
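For trying the trained model on a single sentence, a simple greedy decoding loop along the following lines works. The `<sos>`/`<eos>` indices and tokenization are assumptions, and test_example.py may decode differently.

```python
import torch

@torch.no_grad()
def greedy_translate(model, src_ids, sos_idx, eos_idx, max_len=128, device="cpu"):
    model.eval()
    src = torch.tensor([src_ids], device=device)     # (1, src_len) source token IDs
    tgt = torch.tensor([[sos_idx]], device=device)   # start the target with <sos>
    for _ in range(max_len):
        logits = model(src, tgt)                     # (1, tgt_len, vocab_size)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        tgt = torch.cat([tgt, next_id], dim=1)
        if next_id.item() == eos_idx:                # stop once <eos> is produced
            break
    return tgt.squeeze(0).tolist()
```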

References

  1. "Attention Is All You Need" paper
  2. Transformer implementation by Kevin Ko
  3. "The Illustrated Transformer" by Jay Alammar
  4. TensorFlow Transformer tutorial
