Layer-Wise Learning Rate #984
-
Is it possible to do layer-wise learning rates in skorch, similar to what is explained here for PyTorch?
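For context, the plain-PyTorch approach the question alludes to is to pass parameter groups to the optimizer. A minimal sketch (the model, layer indices, and learning rates are illustrative, not taken from the linked post):

```python
import torch
from torch import nn

# Toy two-layer model; the layers and learning rates are arbitrary.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

# Per-layer learning rates via optimizer parameter groups.
optimizer = torch.optim.SGD(
    [
        {'params': model[0].parameters(), 'lr': 1e-3},  # first layer: small lr
        {'params': model[2].parameters(), 'lr': 1e-2},  # last layer: larger lr
    ],
    lr=1e-2,  # default lr for groups that do not set their own
)
```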
-
I tried to write a customized optimizer and pass it to the classifier, but I'm not sure that's the right way to do it here.
-
To implement layer-wise learning rates you can use the param group feature of PyTorch optimizers. In essence, you define parameter groups with optimizer-specific attributes (like the learning rate). In plain PyTorch you would have to provide the actual parameter objects. Since skorch uses lazy initialization, the parameters are not known before the net is initialized, so skorch provides its own version of param groups that matches on parameter names instead of the actual parameter objects. A basic example from the docs:

```python
net = NeuralNet(
    my_net,
    optimizer__param_groups=[
        ('embedding.*', {'lr': 0.0}),
        ('linear0.bias', {'lr': 1}),
    ],
)
```

Note that `embedding.*` and `linear0.bias` have to match the names of your module's parameters; you can list those names, for example with `[name for name, param in torchvision.models.vit_b_16().named_parameters()]`.
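To make the mapping from patterns to parameter names concrete, here is a self-contained sketch; the module, its layer names, and the learning rates are made up for illustration, while `optimizer__param_groups` itself is the skorch feature described above:

```python
import torch
from torch import nn
from skorch import NeuralNetClassifier

class MyModule(nn.Module):
    """Toy module whose parameter names match the patterns below."""
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(100, 16)
        self.linear0 = nn.Linear(16, 2)

    def forward(self, X):
        return self.linear0(self.embedding(X).mean(dim=1))

# Inspect the parameter names that the patterns have to match.
print([name for name, _ in MyModule().named_parameters()])
# ['embedding.weight', 'linear0.weight', 'linear0.bias']

net = NeuralNetClassifier(
    MyModule,
    optimizer=torch.optim.SGD,
    lr=0.01,  # default learning rate for parameters not matched below
    optimizer__param_groups=[
        ('embedding.*', {'lr': 0.0}),   # effectively freeze the embedding
        ('linear0.bias', {'lr': 1.0}),  # give the bias its own learning rate
    ],
)
net.initialize()  # param groups are resolved against parameter names here
```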
-
Thank you @githubnemo