How is the dataset used to train? #23

Pisces032 · 2024-08-07T02:10:36Z

I'm trying to use PEFT to fine-tune the model, and I'd like to understand how AnghaBench_compile.jsonl is used for training.
I noticed declare -a dataset=( "path_to_llm4decompile_data/arrow/part-00000" ) in run_llm4decompile_train.sh, but I can't work out the training process from it.
Does the Colossal AI format hide some details about the model or the training process?
Thank you so much!

rocky-lq · 2024-08-07T03:21:17Z

Thanks for your interest in our project. We've updated the guidance for preparing Colossal AI training data; please refer to Prepare the data.

Additionally, we recommend using LLaMA Factory to train the llm4decompile model, as it is more user-friendly. For more details, please visit LLaMA-Factory.

Pisces032 · 2024-08-07T05:33:19Z

Thank you!
