SFTTrainer training very slow on GPU. Is this training speed expected? #2378
pledominykas asked this question in Q&A · Unanswered
I am currently trying to perform full fine-tuning of the ai-forever/mGPT model (1.3B parameters) on a single A100 GPU (40GB VRAM) in Google Colab. However, training is very slow: ~0.06 it/s.
Here is my code:
```python
from datasets import load_dataset
from trl import SFTTrainer

# model and tokenizer for ai-forever/mGPT are loaded earlier in the notebook

dataset = load_dataset("allenai/c4", "lt")
train_dataset = dataset["train"].take(10000)
eval_dataset = dataset["validation"].take(1000)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    dataset_text_field="text",
    max_seq_length=2048,
)
trainer_stats = trainer.train()
```
The trainer output estimates it will take ~10 hours to process the 10k examples from the C4 dataset.
These are the relevant package versions:
```
Package       Version
accelerate    0.34.2
bitsandbytes  0.44.1
datasets      3.1.0
peft          0.13.2
torch         2.5.0+cu121
trl           0.12.0
```
Judging by GPU utilization, the model does get loaded onto the GPU, but training is still very slow.
I tried passing keep_in_memory=True when loading the dataset, but it did not help.
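For reference, that attempt was just adding the flag to the load_dataset call, roughly:

```python
# Same load as above, but asking datasets to keep the data in RAM
dataset = load_dataset("allenai/c4", "lt", keep_in_memory=True)
```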
I also tried pre-tokenizing the dataset and using Trainer instead of SFTTrainer, but the performance was similar.
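The pre-tokenization attempt looked roughly like this (the output_dir and map arguments here are from memory, so treat them as approximate rather than the exact code I ran):

```python
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

def tokenize(batch):
    # Truncate to the same max length used with SFTTrainer
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized_train = train_dataset.map(tokenize, batched=True, remove_columns=train_dataset.column_names)
tokenized_eval = eval_dataset.map(tokenize, batched=True, remove_columns=eval_dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mgpt-lt"),
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    # Causal LM collator (mlm=False) so labels are the shifted input ids
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```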
Is this the expected training speed, or is there an issue with my code? And if it is an issue, what would a possible fix be?