English-to-Arabic Translation Issue #16

Open
Amshaker opened this issue Dec 2, 2023 · 0 comments

Amshaker commented Dec 2, 2023

Hi @Nagoudi @elmadany,

Thank you so much for open-sourcing your awesome models. I would like to use AraT5 or AraT5v2 for machine translation from English to Arabic. Could you please share an example of how to do that? I tried your models with the code below, but the output does not make sense:

from transformers import T5Tokenizer, AutoModelForSeq2SeqLM, AutoTokenizer, pipeline


model = AutoModelForSeq2SeqLM.from_pretrained("UBC-NLP/AraT5-msa-base")
tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/AraT5-msa-base")
tokenizer.src_lang="English"
tokenizer.tgt_lang="Arabic"

ar_prompt="The scene displays a group of people gathered around a wooden dining table in an indoor setting."
input_ids = tokenizer(ar_prompt, return_tensors="pt").input_ids


outputs = model.generate(input_ids)
print("Tokenized input:", tokenizer.tokenize(ar_prompt))
print("Decoded output:", tokenizer.decode(outputs[0], skip_special_tokens=True))

This is the current output:

Tokenized input: ['▁The', '▁scene', '▁display', 's', '▁a', '▁group', '▁of', '▁people', '▁gathered', '▁around', '▁a', '▁wooden', '▁di', 'ning', '▁table', '▁in', '▁an', '▁indoor', '▁setting', '.']

Decoded output: هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب

Could you please let me know what is wrong with the code above, or share an example of how to use your models for English-to-Arabic translation?
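
One possible reading, stated as an assumption rather than a confirmed answer from the maintainers: `UBC-NLP/AraT5-msa-base` looks like a pretrained base checkpoint, and T5-style pretrained checkpoints usually have to be fine-tuned on parallel data before they can translate. Setting `src_lang` / `tgt_lang` on the tokenizer is also likely a no-op for a T5-style tokenizer. The sketch below uses only the standard library: `repetition_ratio` flags degenerate output such as the repeated token above, and `build_pairs` shapes (English, Arabic) sentence pairs into source/target records for seq2seq fine-tuning. Both function names, the record keys, and the task prefix are illustrative assumptions, not part of the AraT5 API.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def repetition_ratio(decoded: str) -> float:
    """Return the share of whitespace-separated tokens equal to the most common one.

    A value close to 1.0 suggests degenerate, repetitive generation.
    """
    tokens = decoded.split()
    if not tokens:
        return 0.0
    most_common_count = Counter(tokens).most_common(1)[0][1]
    return most_common_count / len(tokens)


def build_pairs(
    parallel: Iterable[Tuple[str, str]],
    prefix: str = "translate English to Arabic: ",  # hypothetical task prefix
) -> List[Dict[str, str]]:
    """Turn (english, arabic) tuples into records for seq2seq fine-tuning.

    Pairs with an empty side are skipped; surrounding whitespace is stripped.
    """
    records = []
    for english, arabic in parallel:
        english, arabic = english.strip(), arabic.strip()
        if english and arabic:
            records.append({"source": prefix + english, "target": arabic})
    return records


if __name__ == "__main__":
    # The decoded output reported above: one token repeated 19 times.
    degenerate = " ".join(["هحب"] * 19)
    print(repetition_ratio(degenerate))

    # A made-up sentence pair, for illustration only.
    print(build_pairs([("Hello", "مرحبا")]))
```

The records from `build_pairs` would still need to be tokenized and passed to a training loop; that part depends on the chosen checkpoint and is left out here.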
