
Chatur-GPT

  • This project is an implementation of the "Attention Is All You Need" paper (Vaswani et al., 2017).

Model Description

  • Transformer Based (124M): Chatur-GPT is a generative pre-trained Transformer (124M parameters) built from scratch and trained on a multilingual dataset, enabling effective translation across several language pairs.

  • Visualization: An attention visualizer dynamically displays the attention patterns of the pre-trained model, providing insight into how it interprets and prioritizes different parts of the input sequence.

  • Dataset: The model was trained on Opus Books, a multilingual dataset available through the Hugging Face Hub.

  • Metrics: The model was trained for 20 epochs, and the stored weights are later used for inference and visualization. A word-level tokenizer handles text pre-processing, and 512-dimensional input embeddings combined with positional encodings capture token order. The model achieved a BLEU score of 0.68, indicating strong translation accuracy and context preservation across languages. A sketch of the data preparation follows below.
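
For illustration, here is a minimal sketch of the data preparation described above, using the Hugging Face datasets and tokenizers libraries. The "en-it" pair, the special tokens, and min_frequency are assumptions; the repository's actual loading code may differ.

```python
from datasets import load_dataset
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import WordLevelTrainer

# Opus Books exposes one configuration per language pair, e.g. "en-it".
raw = load_dataset("opus_books", "en-it", split="train")

# Word-level tokenizer: one vocabulary entry per whitespace-separated word.
tokenizer = Tokenizer(WordLevel(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
trainer = WordLevelTrainer(
    special_tokens=["[UNK]", "[PAD]", "[SOS]", "[EOS]"], min_frequency=2
)
tokenizer.train_from_iterator(
    (pair["translation"]["en"] for pair in raw), trainer=trainer
)

print(tokenizer.encode("Attention is all you need.").tokens)
```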

How to run

  • Install the requirements.
pip install -r requirements.txt
  • To run inference, open and run Inference.ipynb.
  • To change the translation language (a sketch of the config follows this list):
>> Go to config.py
>> Change "lang_tgt" to the desired language (the parameters are customizable)
  • To train the model
python train.py
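
For reference, here is a hypothetical sketch of what the settings in config.py might contain; "lang_tgt" is the key mentioned above, while the remaining key names and values are assumptions.

```python
def get_config():
    """Hypothetical training/inference settings; key names other than
    "lang_tgt" are assumptions and may differ from the actual config.py."""
    return {
        "lang_src": "en",   # source language (assumed key name)
        "lang_tgt": "it",   # target language: change this to translate into another language
        "d_model": 512,     # embedding dimension, per the README
        "num_epochs": 20,   # epochs used to produce the stored weights
        "batch_size": 8,    # assumption
        "lr": 1e-4,         # assumption
    }
```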

Note: The code is device agnostic (it can run on both a CPU and a GPU), following the usual PyTorch pattern shown below.
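
The variable names in this snippet are illustrative, not necessarily the repository's.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running on: {device}")
# The model and every batch are then moved onto it with .to(device).
```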

Architecture

[Transformer architecture diagram]
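
For concreteness, here is a minimal sketch of the sinusoidal positional encoding from the paper at the 512-dimensional embedding size mentioned above; max_len and dropout are assumptions.

```python
import math

import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Adds the fixed sinusoidal position signal from the paper to token embeddings."""

    def __init__(self, d_model: int = 512, max_len: int = 5000, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        pos = torch.arange(max_len).unsqueeze(1)  # (max_len, 1)
        div = torch.exp(
            torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)  # even dimensions
        pe[:, 1::2] = torch.cos(pos * div)  # odd dimensions
        self.register_buffer("pe", pe.unsqueeze(0))  # (1, max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) token embeddings
        return self.dropout(x + self.pe[:, : x.size(1)])
```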

Overview

[Transformer overview illustration, sourced from Google AI]

Output

[Sample translation output]

  • Check the output.txt file for more examples (en-it translation)

Note: The output quality can be improved both syntactically and semantically by training for more epochs (~30).

Attention Visualizer

[Attention pattern visualization]
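
As a hedged sketch, attention heatmaps like the one above can be drawn with matplotlib, assuming the attention scores of one decoder layer have been captured during inference; the tensor shape and the function name here are assumptions, not the repository's API.

```python
import matplotlib.pyplot as plt
import torch


def plot_attention(attn: torch.Tensor, src_tokens, tgt_tokens, head: int = 0):
    """Heatmap of a single attention head.

    `attn` is assumed to be shaped (num_heads, tgt_len, src_len), e.g. the
    cross-attention scores captured from one decoder layer during inference.
    """
    scores = attn[head].detach().cpu().numpy()
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.imshow(scores, cmap="viridis")
    ax.set_xticks(range(len(src_tokens)))
    ax.set_xticklabels(src_tokens, rotation=90)
    ax.set_yticks(range(len(tgt_tokens)))
    ax.set_yticklabels(tgt_tokens)
    ax.set_xlabel("source tokens")
    ax.set_ylabel("target tokens")  # each row shows where that target token attends
    fig.tight_layout()
    plt.show()
```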
