Replies: 2 comments 2 replies
@iMountTai Hi, I was able to resolve it and no longer see the overflow issue. Thank you.
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```shell
torchrun --nnodes 1 --nproc_per_node 1 run_clm_llama_pretraining_peft.py \
    --deepspeed ${deepspeed_config_file} \
    --model_type ${model_type} \
    --tokenizer_name_or_path ${tokenizer_path_lang_vac} \
    --dataset_dir ${dataset_dir_2} \
    --data_cache_dir ${data_cache} \
    --validation_split_percentage 0.1 \
    --per_device_train_batch_size ${per_device_train_batch_size} \
    --per_device_eval_batch_size ${per_device_eval_batch_size} \
    --do_train \
    --do_eval \
    --seed 42 \
    --fp16 \
    --num_train_epochs 1 \
    --lr_scheduler_type cosine \
    --learning_rate ${lr} \
    --warmup_ratio 0.05 \
    --weight_decay 0.01 \
    --logging_strategy steps \
    --logging_steps 10 \
    --save_strategy steps \
    --save_total_limit 2 \
    --save_steps 200 \
    --gradient_accumulation_steps ${gradient_accumulation_steps} \
    --preprocessing_num_workers 8 \
    --block_size 512 \
    --output_dir ${output_dir} \
    --overwrite_output_dir True \
    --ddp_timeout 30000 \
    --logging_first_step True \
    --lora_rank ${lora_rank} \
    --lora_alpha ${lora_alpha} \
    --trainable ${lora_trainable} \
    --lora_dropout ${lora_dropout} \
    --torch_dtype bfloat16 \
    --gradient_checkpointing True \
    --ddp_find_unused_parameters False
```
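For context, this RuntimeError commonly appears when `--gradient_checkpointing True` is combined with a LoRA/PEFT setup in which every base-model parameter is frozen: nothing in the checkpointed forward pass requires grad, so the loss has no `grad_fn`. A frequently used remedy in `transformers` is calling `model.enable_input_require_grads()` before training (it makes the embedding outputs require grad). The snippet below is a minimal sketch of the mechanism in plain PyTorch, not the script's actual code; the tensor shapes and names are illustrative:

```python
import torch

# A frozen weight, like the base model's parameters under LoRA.
w = torch.nn.Parameter(torch.randn(4), requires_grad=False)
x = torch.randn(4)  # input that also does not require grad

loss = (w * x).sum()
try:
    loss.backward()
    raised = False
except RuntimeError:
    # RuntimeError: element 0 of tensors does not require grad
    # and does not have a grad_fn
    raised = True

# Fix: force the inputs to require grad before the forward pass --
# analogous to what model.enable_input_require_grads() does in transformers.
x.requires_grad_(True)
loss = (w * x).sum()
loss.backward()  # succeeds; gradients now flow to x
```

If the frozen-everything diagnosis matches, also double-check that the `--trainable` LoRA modules actually match parameter names in the model, since a typo there leaves no trainable parameters at all.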