How to extract/save weights after training? #13

minertom · 2020-12-06T01:25:56Z

OK, here I am displaying my utter ignorance again. I did find a post on towards data science entitled "everything you need to know about saving weights in pytorch".

https://towardsdatascience.com/everything-you-need-to-know-about-saving-weights-in-pytorch-572651f3f8de

Now I am stuck. Having saved the weights in the example project, I am aware that the file is not in a human readable format.

So my question now becomes: is there a way to take this file of weights, which is in .pth format, and convert it to numpy, which would be human-readable? I would like to be able to do further manipulation of the weights in numpy.

Thank you for your patience,
Tom

dvgodoy · 2020-12-06T16:43:48Z

Hi Tom,

Saving and loading a model is meant for resuming training or for deployment, so the weights are stored in a binary format that is not intended to be read by humans.
The save method serializes the state dictionary into its binary representation. If you want to do anything else, such as converting the weights to Numpy arrays or to human-readable text, you can work with the dictionary itself.

For example, let's say you have a simple sequential model:
model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))

If you check its state dictionary with model.state_dict(), you get what you would expect:
OrderedDict([('0.weight', tensor([[ 0.0485, 0.3305], [ 0.6338, 0.4103], ... [ 0.3358, -0.3827], [-0.4230, 0.2328]])), ('0.bias', tensor([ 0.2907, 0.3352, 0.1105, -0.6123, 0.2566, -0.4548, 0.4116, 0.4219, -0.4997, 0.0397])), ('2.weight', tensor([[-0.2709, 0.0192, 0.0961, -0.0101, -0.3044, 0.2777, 0.0432, 0.0935, -0.2234, -0.0936]])), ('2.bias', tensor([-0.2365]))])

You can also get the state dictionary of any single layer if you wish: model[2].state_dict() returns only the weights of the last layer.
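
As a minimal sketch of that per-layer lookup, the snippet below rebuilds the same example model and compares the key names of the full state dictionary with those of the last layer alone. The variable names (full_keys, last_keys, shapes) are only illustrative.

```python
import torch.nn as nn

# Same architecture as the example model above
model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))

# Full state dictionary: each key is prefixed with the layer's index
full_keys = list(model.state_dict().keys())

# State dictionary of the last layer only: the "2." prefix is gone
last_layer_state = model[2].state_dict()
last_keys = list(last_layer_state.keys())

# Shapes of the last layer's parameters
shapes = {name: tuple(t.shape) for name, t in last_layer_state.items()}

print(full_keys)
print(last_keys)
print(shapes)
```

Note that the ReLU at index 1 has no parameters, which is why no key starts with "1.".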

They are all tensors, but you can make them all Numpy arrays:
state_dict = model.state_dict()
dict_numpy = {k: v.cpu().numpy() for k, v in state_dict.items()}
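
If the weights are already sitting in a .pth file, the same conversion can start from the file instead of the live model. The following is a rough sketch under a few assumptions: the file contains a state dictionary (not a whole pickled model), and the file names and temporary directory are made up for the illustration. Storing the result with np.savez is just one option for keeping the arrays in Numpy's own format.

```python
import os
import tempfile

import numpy as np
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))

# Hypothetical file locations, used only for this illustration
workdir = tempfile.mkdtemp()
pth_path = os.path.join(workdir, "model_weights.pth")
npz_path = os.path.join(workdir, "model_weights.npz")

# Save the state dictionary in PyTorch's binary format
torch.save(model.state_dict(), pth_path)

# Load it back onto the CPU; no model object is needed for this step
loaded_state = torch.load(pth_path, map_location="cpu")

# Convert every tensor in the dictionary to a Numpy array
dict_numpy = {name: tensor.numpy() for name, tensor in loaded_state.items()}

# Optionally store the arrays in Numpy's own .npz format for later use
np.savez(npz_path, **dict_numpy)
restored = np.load(npz_path)
print({name: restored[name].shape for name in restored.files})
```

From there, restored["0.weight"] and the other entries are ordinary Numpy arrays that can be manipulated without PyTorch.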

Or if you want to have them in plain text, you can use JSON:
import json
text_state = json.dumps({k: v.tolist() for k, v in state_dict.items()})
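
The JSON text can also be turned back into weights. Here is a small sketch of that round trip, assuming the receiving model has the same architecture as the one that produced the text; new_model, parsed and rebuilt_state are names chosen for this example only.

```python
import json

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))
state_dict = model.state_dict()

# Tensors -> nested Python lists -> JSON text
text_state = json.dumps({name: t.tolist() for name, t in state_dict.items()})

# JSON text -> nested Python lists -> tensors
parsed = json.loads(text_state)
rebuilt_state = {name: torch.tensor(values) for name, values in parsed.items()}

# Load the rebuilt weights into a fresh model with the same architecture
new_model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))
new_model.load_state_dict(rebuilt_state)

# Check that every parameter survived the round trip
same = all(torch.equal(new_model.state_dict()[k], state_dict[k]) for k in state_dict)
print(same)
```

JSON is convenient for inspection, but the text is much larger than the binary file, so it is probably best kept for small models.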

Does it help?

In Chapter 5 (which I will publish in a few days), I will introduce a method to visualize the filters (weights) of convolutional layers, and I will also introduce hooks, which you can use to capture the outputs produced by each layer. I think you'll like the next Chapter :-) I'll be looking forward to your feedback on it.

Best,
Daniel

minertom · 2020-12-07T04:07:22Z

Daniel, Thank you for the reply. Wow. Looks great. It will be a couple of days before I get a chance to "grok" this fully, but it makes sense. Yes, I am really looking forward to the next chapter :-) . Regards Tom


