I've been exploring SONAR's multilingual capabilities and am impressed by its ability to handle diverse languages through its encoder-decoder architecture. I'm wondering if it would be possible to extend SONAR to support structured languages, such as programming languages or other context-free grammars, by treating them as new languages in the system.
Given SONAR's language-agnostic design and its use of SentencePiece tokenization, it seems theoretically possible to train SONAR to handle structured languages by defining them as new language codes (e.g., "py_Code" for Python, "java_Code" for Java, or "cfg_Form" for formal grammars).
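To make the idea concrete, here is a minimal toy sketch of what I mean by "defining new language codes". It is plain Python and does not use SONAR's actual tokenizer or training code. The `__code__` token format, the helper names `add_language_codes` and `tag_sequence`, and the tiny vocabulary are all hypothetical placeholders, and the sketch assumes token ids are contiguous from 0.

```python
# Toy illustration only: NOT SONAR's real tokenizer or API.
# All names and the "__code__" token format are hypothetical.
from typing import Dict, List


def add_language_codes(vocab: Dict[str, int], codes: List[str]) -> Dict[str, int]:
    """Return a copy of vocab with one special token per new language code.

    Assumes existing ids are contiguous (0..len(vocab)-1), so each new
    token can take the next free id.
    """
    extended = dict(vocab)
    for code in codes:
        token = f"__{code}__"
        if token not in extended:
            extended[token] = len(extended)
    return extended


def tag_sequence(token_ids: List[int], vocab: Dict[str, int], code: str) -> List[int]:
    """Prefix an already-tokenized sequence with its language-code token id."""
    return [vocab[f"__{code}__"]] + token_ids


# A tiny stand-in vocabulary with one existing natural-language code.
base_vocab = {"<pad>": 0, "<unk>": 1, "<s>": 2, "</s>": 3, "__eng_Latn__": 4}

# Register the structured "languages" from the question as new codes.
vocab = add_language_codes(base_vocab, ["py_Code", "java_Code", "cfg_Form"])

# Pretend [10, 11, 12] are subword ids for a Python snippet.
tagged = tag_sequence([10, 11, 12], vocab, "py_Code")
```

In a real model, each added token would presumably also need a new row in the embedding table, which is part of what I am unsure about.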
Therefore, may I ask the following:
1. Is it possible to train a structured language as a new "language" so that SONAR can encode and decode its expressions?
2. Would any modifications be needed to handle strict syntactic rules?

If so, may I further ask how to add a new language to SONAR, i.e., what the training recipe is?
Looking forward to hearing from you :)