Training neural language models on multi-language corpus.


GPT2-TH-ZH

This project aims to create a text-generation model based on the GPT-2 architecture that can generate text in two languages: Thai (TH) and Chinese (ZH). The following examples illustrate the capabilities we expect from the model:

  • <th> ทำไมฉันต้องช่วย? <zh> 我何必帮那么多忙? <|endoftext|>
  • <th> เช่น ความคิดถึง <zh> 比如 ,怀旧。 <|endoftext|>
  • <th> ทีนี้ ช่วงการลงจอดของวิถีโคจรทั้งหมดคือ เจ็ด ชั่วโมง <zh> 整个着陆过程花了 7 个小时 。 <|endoftext|>
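
Each training sample pairs a Thai sentence with its Chinese translation, delimited by the language tags shown above. A minimal sketch of how such samples might be assembled (the helper function name is our own, not part of this repository):

```python
def make_sample(thai: str, chinese: str) -> str:
    """Join a Thai sentence and its Chinese translation into one training
    string, using the <th>/<zh> language tags and GPT-2's <|endoftext|>
    terminator, following the examples above."""
    return f"<th> {thai} <zh> {chinese} <|endoftext|>"

# Build a toy corpus from parallel (Thai, Chinese) sentence pairs.
pairs = [
    ("ทำไมฉันต้องช่วย?", "我何必帮那么多忙?"),
    ("เช่น ความคิดถึง", "比如 ,怀旧。"),
]
corpus = [make_sample(th, zh) for th, zh in pairs]
```

Strings in this shape can then be tokenized and fed to a standard GPT-2 causal language-modeling objective, with `<th>`, `<zh>`, and `<|endoftext|>` registered as special tokens.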

Our goal is to develop a text-generation model that generates Thai text together with its corresponding Chinese translation. The project consists of the model source code and the dataset used to train the GPT-2 language model.

The model will be trained to generate text in both Thai and Chinese, with Thai as the primary language and Chinese as the translation. By achieving this goal, we hope to understand how neural language models behave on a multi-language corpus.
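
At inference time, prompting the trained model with `<th> …Thai text… <zh>` should make it continue with the Chinese translation, which can then be cut out of the generated string. A hedged sketch of that post-processing step (the helper below is illustrative and assumes the tag format shown in the examples):

```python
def extract_translation(generated: str) -> str:
    """Return the Chinese span between the <zh> tag and the
    <|endoftext|> terminator in a generated string."""
    # Everything after the first <zh> tag is the (would-be) translation.
    zh_part = generated.split("<zh>", 1)[1]
    # Trim at the end-of-text token and strip surrounding spaces.
    return zh_part.split("<|endoftext|>", 1)[0].strip()

sample = "<th> เช่น ความคิดถึง <zh> 比如 ,怀旧。 <|endoftext|>"
print(extract_translation(sample))  # → 比如 ,怀旧。
```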
