
Encountering NotImplementedError when fine-tuning a single-task .pth model #4262

Open
njzjz opened this issue Oct 26, 2024 · 1 comment


njzjz commented Oct 26, 2024

I took a pretrained single-task model for fine-tuning and only modified the training-data part of the model's input.json. The fine-tune command was: `dp --pt train input.json -t dpa2.pth --use-pretrain-script`.
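For reference, a minimal sketch of what the modified training-data block in input.json might look like (the system paths, batch size, and step count below are placeholders, not the values actually used in this report):

```json
{
  "training": {
    "training_data": {
      "systems": ["../data/sys_A", "../data/sys_B"],
      "batch_size": "auto"
    },
    "numb_steps": 100000
  }
}
```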

The error is as follows:

```
Traceback (most recent call last):
  File "/home/yanli/deepmd-kit-v3/bin/dp", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/yanli/deepmd-kit-v3/lib/python3.11/site-packages/deepmd/main.py", line 923, in main
    deepmd_main(args)
  File "/home/yanli/deepmd-kit-v3/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/yanli/deepmd-kit-v3/lib/python3.11/site-packages/deepmd/pt/entrypoints/main.py", line 562, in main
    train(FLAGS)
  File "/home/yanli/deepmd-kit-v3/lib/python3.11/site-packages/deepmd/pt/entrypoints/main.py", line 267, in train
    config["model"], finetune_links = get_finetune_rules(
                                      ^^^^^^^^^^^^^^^^^^^
  File "/home/yanli/deepmd-kit-v3/lib/python3.11/site-packages/deepmd/pt/utils/finetune.py", line 140, in get_finetune_rules
    if "model" in state_dict:
       ^^^^^^^^^^^^^^^^^^^^^
  File "/home/yanli/deepmd-kit-v3/lib/python3.11/site-packages/torch/jit/_script.py", line 868, in __contains__
    return self.forward_magic_method("__contains__", key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yanli/deepmd-kit-v3/lib/python3.11/site-packages/torch/jit/_script.py", line 855, in forward_magic_method
    raise NotImplementedError()
NotImplementedError
```

The input script is attached: input.json

Is this because a .pth model is used instead of a .pt model?

Originally posted by @darkgezi in #4255
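For context, a minimal sketch (with placeholder file names) of why a membership test like `"model" in state_dict` works for a regular checkpoint but raises NotImplementedError for a TorchScript archive; torch.load dispatches to torch.jit.load when it detects a TorchScript zip archive:

```python
import torch

# A training checkpoint saved with torch.save() loads back as a plain dict,
# so `"model" in state_dict` behaves as expected.
state_dict = torch.load("model.ckpt.pt", map_location="cpu")  # placeholder path
print("model" in state_dict)  # True or False, no error

# A frozen .pth model is a TorchScript archive; torch.jit.load() (or torch.load(),
# which dispatches to it) returns a ScriptModule. ScriptModule.__contains__
# forwards to a scripted __contains__ method that does not exist here, so it
# raises NotImplementedError -- the error in the traceback above.
script_model = torch.jit.load("dpa2.pth", map_location="cpu")
print("model" in script_model)  # raises NotImplementedError
```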


njzjz commented Oct 28, 2024

Thought: we do need a universal model-loading API. There are many torch.load and torch.jit.load usages scattered across the code, and they are a mess.
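A minimal sketch of one possible shape for such an API (the name `load_model_file` and the extension-based dispatch are assumptions for illustration, not existing deepmd-kit code):

```python
from pathlib import Path

import torch


def load_model_file(model_file: str, map_location="cpu"):
    """Load either a training checkpoint (.pt, saved with torch.save)
    or a frozen TorchScript model (.pth, saved with torch.jit.save)."""
    if Path(model_file).suffix == ".pth":
        # Frozen/serialized model: returns a ScriptModule.
        return torch.jit.load(model_file, map_location=map_location)
    # Training checkpoint: returns whatever object was saved (usually a dict).
    return torch.load(model_file, map_location=map_location)
```

Callers such as get_finetune_rules could then branch on `isinstance(model, torch.jit.ScriptModule)` and raise a clear error message (or extract what they need) instead of hitting the opaque NotImplementedError.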
