Timings for training the indicTrans model. #56
Unanswered · upadrasta84 asked this question in Q&A
Dear all,
We have taken several stabs at getting this model to work following the README in this repository, and we have tried several AWS EC2 instance configurations. Our latest is a g3.8xlarge with 32 vCPUs, 244 GB RAM, and 2 GPUs.
I believe I created the experiment directory properly, but I hope someone can validate it:
We created a directory called en-indic-exp. Under it are the training directories such as en-ta, en-te, en-hi, etc., and inside each of them are the train.en and train.xx files for that language pair. We also have a directory called devtests inside en-indic-exp. Within devtests there is an 'all' directory, which again contains directories like en-ta, en-te, en-hi, etc.; the files inside those are named dev.te, dev.en, test.te, and test.en.
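To make that concrete, here is the tree as we have set it up (showing en-ta as the example pair; en-te, en-hi, etc. follow the same pattern):

```
en-indic-exp/
├── en-ta/
│   ├── train.en
│   └── train.ta
├── en-te/
├── en-hi/
└── devtests/
    └── all/
        ├── en-ta/
        │   ├── dev.en
        │   ├── dev.ta
        │   ├── test.en
        │   └── test.ta
        ├── en-te/
        └── en-hi/
```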
We kick off data preparation with `./prepare_data_joint_training.sh '../en-indic-exp' 'en' 'indic'`.
The 'learning target BPE' step takes a very long time. It started off showing an estimate of about 500 hours, and it has now settled at roughly 250 hours:

```
learning target BPE
  0%| | 18/32000 [10:55<249:34:01, 28.09s/it]
```
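For what it's worth, that estimate is consistent with the per-iteration time shown: 32,000 merge operations at ~28.09 s each is about 899,000 s, i.e. roughly 250 hours.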
After this, the remainder of the steps are also more or less this slow. The `fairseq-preprocess` and `fairseq-train` steps are almost impossible to get through, and I have never reached the finish line despite trying for the past couple of weeks.
Are these processes expected to take this much time? Is my machine configuration close to the mark, better than needed, or much worse?
Hope someone can point me in the right direction.