Add special case for saving model when running with ZERO-3 optimisation. #149

Willmish · 2024-09-18T12:00:04Z

Tiny fix so that a model sharded with ZeRO-3 can be saved! (Still to test: whether it also saves the ugly placeholder file that confuses the Hugging Face model loader.)

Signed-off-by: Szymon Duchniewicz <[email protected]>

Willmish · 2024-09-18T12:25:35Z

The weird thing is: the model itself saves fine, but a placeholder model.safetensors file is saved alongside it. The output directory looks like:

config.json
generation_config.json
model-00001-of-00004.safetensors
model-00002-of-00004.safetensors
model-00003-of-00004.safetensors
model-00004-of-00004.safetensors
model.safetensors
model.safetensors.index.json
special_tokens_map.json
tokenizer_config.json
tokenizer.json

But model.safetensors shouldn't be there! It causes calls such as AutoModelForCausalLM.from_pretrained() to fail unless the placeholder file is removed.
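Until the save path stops emitting the placeholder, one possible workaround is to delete it after saving. The sketch below is only an illustration and not part of this PR: the helper name `remove_placeholder_checkpoint` is hypothetical, and it assumes the file names match the listing above (a `model.safetensors.index.json` index plus `model-XXXXX-of-XXXXX.safetensors` shards).

```python
from pathlib import Path


def remove_placeholder_checkpoint(output_dir: str) -> bool:
    """Delete a stray model.safetensors when a sharded checkpoint is present.

    Hypothetical helper (not from the PR). Returns True if a file was removed.
    """
    out = Path(output_dir)
    placeholder = out / "model.safetensors"
    index = out / "model.safetensors.index.json"
    shards = list(out.glob("model-*-of-*.safetensors"))

    # Only remove the single-file checkpoint when the sharded one clearly
    # exists, so a legitimate unsharded save is never deleted.
    if placeholder.is_file() and index.is_file() and shards:
        placeholder.unlink()
        return True
    return False
```

In a multi-GPU run this would presumably be called on one process only (for example, guarded by `accelerator.is_main_process`) and only after every process has finished saving.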

Adamliu1 · 2024-09-18T12:32:23Z

llm_unlearn_ucl/unlearn_harm.py

@@ -468,7 +509,30 @@ def main(args) -> None:
        model = model.merge_and_unload()

    # Save final model.
-    if accelerator.is_local_main_process:
+    # NOTE: special case for zero 3


Why are we not handling model saving in the main process? I wonder.

@Adamliu1 good q, I based this on https://huggingface.co/docs/accelerate/usage_guides/deepspeed#saving-and-loading , and I believe it's because each process (one per GPU, 4 in total) holds only a portion of the model, which has to be gathered across processes. Hence the model cannot simply be unwrapped and saved on a single process.
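For reference, the pattern from the linked Accelerate guide can be sketched roughly as below. This is an approximation of that pattern, not the exact code in this PR; the function name `save_zero3_model` is made up for illustration, and `accelerator` is assumed to be an `accelerate.Accelerator` configured for DeepSpeed ZeRO-3 (the guide also appears to require the 16-bit weight-gathering option to be enabled in the DeepSpeed or Accelerate config).

```python
def save_zero3_model(accelerator, model, output_dir: str) -> None:
    """Save a ZeRO-3 sharded model; must be called on EVERY process.

    Hypothetical wrapper around the pattern in the Accelerate DeepSpeed guide.
    """
    # Make sure all ranks have finished training before gathering weights.
    accelerator.wait_for_everyone()

    # Collective call: each rank holds only a shard of the parameters, so all
    # ranks must participate for the full state dict to be assembled.
    state_dict = accelerator.get_state_dict(model)

    unwrapped_model = accelerator.unwrap_model(model)
    unwrapped_model.save_pretrained(
        output_dir,
        # Only the main process actually writes files to disk.
        is_main_process=accelerator.is_main_process,
        save_function=accelerator.save,
        state_dict=state_dict,
    )
```

The key point is that the gather runs outside any `if accelerator.is_local_main_process:` guard; placing a collective call behind such a guard would leave the other ranks out of the gather, while `is_main_process` inside `save_pretrained` still restricts the actual write to one process.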

Willmish and others added 2 commits September 18, 2024 12:56

Add special case for saving model when running with ZERO-3 optimisation.

83239b0

Signed-off-by: Szymon Duchniewicz <[email protected]>

🎨 Format Python code with psf/black

7ac0b9c

Willmish requested review from TheRootOf3 and Adamliu1 and removed request for TheRootOf3 September 18, 2024 12:08

Adamliu1 reviewed Sep 18, 2024

Add deepspeed 4 gpu config.

837dd6e

Signed-off-by: Szymon Duchniewicz <[email protected]>

Willmish force-pushed the willmish/zero3_save_model branch from 70ae769 to 837dd6e Compare September 18, 2024 14:15
