English-to-Arabic Translation Issue #16

Amshaker · 2023-12-02T15:21:22Z

Thank you so much for open-sourcing your awesome models. I would like to use AraT5 or AraT5v2 for machine translation from English to Arabic. Could you please share an example of how to do that? I tried your models with the following code, but the output does not make sense:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the pretrained AraT5 checkpoint and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("UBC-NLP/AraT5-msa-base")
tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/AraT5-msa-base")
# Attempt to set the translation direction (I am not sure this tokenizer uses these attributes)
tokenizer.src_lang = "English"
tokenizer.tgt_lang = "Arabic"

# English source sentence that should be translated into Arabic
en_prompt = "The scene displays a group of people gathered around a wooden dining table in an indoor setting."
input_ids = tokenizer(en_prompt, return_tensors="pt").input_ids

# Generate with default settings and inspect both the tokenization and the decoded result
outputs = model.generate(input_ids)
print("Tokenized input:", tokenizer.tokenize(en_prompt))
print("Decoded output:", tokenizer.decode(outputs[0], skip_special_tokens=True))

This is the current output:

Tokenized input: ['▁The', '▁scene', '▁display', 's', '▁a', '▁group', '▁of', '▁people', '▁gathered', '▁around', '▁a', '▁wooden', '▁di', 'ning', '▁table', '▁in', '▁an', '▁indoor', '▁setting', '.']

Decoded output: هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب هحب

Please let me know what the issue is in the above code or share an example of how to use your model for translation from English to Arabic.
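
As a side note, the decoded output above is a single token repeated over and over, which is easy to detect automatically when trying different checkpoints or generation settings. Below is a minimal sketch of such a check. It is plain Python and not part of transformers or AraT5; the helper names `repetition_ratio` and `looks_degenerate` and the 0.8 threshold are my own assumptions.

```python
from collections import Counter


def repetition_ratio(text: str) -> float:
    """Return the share of whitespace-separated tokens taken by the most common token."""
    tokens = text.split()
    if not tokens:
        return 0.0
    most_common_count = Counter(tokens).most_common(1)[0][1]
    return most_common_count / len(tokens)


def looks_degenerate(text: str, threshold: float = 0.8) -> bool:
    """Flag outputs dominated by one repeated token (the threshold is an arbitrary assumption)."""
    return repetition_ratio(text) >= threshold


if __name__ == "__main__":
    # Shortened stand-in for the decoded output shown above
    decoded = "هحب " * 5
    print(looks_degenerate(decoded))
```

A check like this only flags the symptom. It says nothing about why the model repeats itself, so it is meant for quickly comparing runs rather than for diagnosing the cause.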
