Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English

pengzhendong/g2p-mix

g2p-mix

Usage

$ pip install g2p-mix
$ python

Mandarin

>>> from g2p_mix import G2pMix
>>> G2pMix().g2p("你这个idea, 不太make sense。", sandhi=True)
[
  { "word": "你", "phones": ["n", "i3"], "lang": "ZH" },
  { "word": "这", "phones": ["zh", "e4"], "lang": "ZH" },
  { "word": "个", "phones": ["g", "e4"], "lang": "ZH" },
  { "word": "idea", "phones": ["AY0", "D", "IY1", "AH0"], "lang": "EN" },
  { "word": ",", "phones": ",", "lang": "SYM" },
  { "word": "不", "phones": ["b", "u2"], "lang": "ZH" },
  { "word": "太", "phones": ["t", "ai4"], "lang": "ZH" },
  { "word": "make", "phones": ["M", "EY1", "K"], "lang": "EN" },
  { "word": "sense", "phones": ["S", "EH1", "N", "S"], "lang": "EN" },
  { "word": "。", "phones": "。", "lang": "SYM" }
]
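
The result above is often consumed as one flat phoneme sequence. The following is a minimal sketch of that step, assuming the return value is a list of dicts shaped as printed (the library may return a different container type). `flatten_phones` and its `keep_symbols` parameter are hypothetical names, not part of g2p-mix, and the sample is hard-coded so the sketch runs without the package installed.

```python
from typing import Dict, List, Union

Token = Dict[str, Union[str, List[str]]]

# A few entries abridged from the Mandarin output above; only the fields the
# helper reads ("phones" and "lang") are kept.
tokens: List[Token] = [
    {"phones": ["n", "i3"], "lang": "ZH"},
    {"phones": ["zh", "e4"], "lang": "ZH"},
    {"phones": ["g", "e4"], "lang": "ZH"},
    {"phones": ["AY0", "D", "IY1", "AH0"], "lang": "EN"},
    {"phones": ",", "lang": "SYM"},
    {"phones": ["b", "u2"], "lang": "ZH"},
]


def flatten_phones(tokens: List[Token], keep_symbols: bool = True) -> List[str]:
    """Concatenate per-token phones into one flat list (hypothetical helper)."""
    flat: List[str] = []
    for tok in tokens:
        phones = tok["phones"]
        if isinstance(phones, str):
            # SYM entries carry a plain string rather than a list.
            if keep_symbols and phones:
                flat.append(phones)
        else:
            flat.extend(phones)
    return flat


with_symbols = flatten_phones(tokens)
without_symbols = flatten_phones(tokens, keep_symbols=False)
print(with_symbols)
```

The `isinstance` check is there because, in the printed output, `SYM` entries hold a string in `phones` while `ZH` and `EN` entries hold a list.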

Cantonese

>>> G2pMix(jyut=True).g2p("你这个idea, 不太make sense。")
[
  { "word": "你", "phones": ["n", "ei5"], "lang": "ZH" },
  { "word": "这", "phones": ["z", "e3"], "lang": "ZH" },
  { "word": "个", "phones": ["g", "o3"], "lang": "ZH" },
  { "word": "idea", "phones": ["AY0", "D", "IY1", "AH0"], "lang": "EN" },
  { "word": ",", "phones": ",", "lang": "SYM" },
  { "word": "不", "phones": ["b", "at1"], "lang": "ZH" },
  { "word": "太", "phones": ["t", "aai3"], "lang": "ZH" },
  { "word": "make", "phones": ["M", "EY1", "K"], "lang": "EN" },
  { "word": "sense", "phones": ["S", "EH1", "N", "S"], "lang": "EN" },
  { "word": "。", "phones": "。", "lang": "SYM" }
]
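
Because every entry carries a `lang` tag, code-switched text can be split into consecutive same-language runs, for example to route Chinese and English spans to different downstream processing. The sketch below assumes the list-of-dicts shape printed above. `language_runs` is a hypothetical helper, not part of g2p-mix, and the sample is hard-coded so it runs without the package installed.

```python
from itertools import groupby
from typing import Dict, List, Tuple, Union

Token = Dict[str, Union[str, List[str]]]

# Entries abridged from the Cantonese output above (final punctuation entry
# omitted); only the "phones" and "lang" fields are kept.
tokens: List[Token] = [
    {"phones": ["n", "ei5"], "lang": "ZH"},
    {"phones": ["z", "e3"], "lang": "ZH"},
    {"phones": ["g", "o3"], "lang": "ZH"},
    {"phones": ["AY0", "D", "IY1", "AH0"], "lang": "EN"},
    {"phones": ",", "lang": "SYM"},
    {"phones": ["b", "at1"], "lang": "ZH"},
    {"phones": ["t", "aai3"], "lang": "ZH"},
    {"phones": ["M", "EY1", "K"], "lang": "EN"},
    {"phones": ["S", "EH1", "N", "S"], "lang": "EN"},
]


def language_runs(tokens: List[Token]) -> List[Tuple[str, List[Token]]]:
    """Group adjacent tokens that share a "lang" tag (hypothetical helper)."""
    # groupby only merges neighbours, so the original token order is preserved.
    return [
        (lang, list(group))
        for lang, group in groupby(tokens, key=lambda tok: tok["lang"])
    ]


runs = language_runs(tokens)
summary = [(lang, len(group)) for lang, group in runs]
print(summary)
```

`itertools.groupby` is used rather than a dict keyed by language because it keeps separate runs of the same language apart instead of merging them.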
